Causal inference method based on confounder hidden compact representation model

CAI Ruichu, BAI Yiming, QIAO Jie, HAO Zhifeng

Journal of Computer Applications 2021, 41 (10): 2793-2798. DOI: 10.11772/j.issn.1001-9081.2020122066

Causal inference methods can be used to discover causal relationships from observational data. When the underlying causal structure contains confounders, however, these methods may return wrong causal relationships. To solve this problem, a causal inference method based on the Confounder Hidden Compact Representation (CHCR) model was proposed. Firstly, candidate models with intermediate hidden variables that compactly represent the cause variables were constructed based on the CHCR model. Secondly, the Bayesian Information Criterion (BIC) was used to score the candidate models, and the model with the highest score was selected as the best model. Finally, the real causal relationship between the variables was judged according to the quality of compaction in the best model. Theoretical analysis shows that the proposed method can identify causal structures with confounders that cannot be correctly identified by the classical constraint-based methods. In some cases, such as small sample sizes, BIC scoring can also improve the performance of the proposed method. Experimental results show that, as the number of samples changes, the proposed method is significantly more accurate than classical methods such as the Really Fast Causal Inference (RFCI) algorithm, and that it is suitable for variables with different numbers of possible values. When different types of causal structures are mixed, the accuracy of the proposed method is higher than that of classical methods such as the Max-Min Hill-Climbing (MMHC) algorithm. Moreover, the proposed method obtains the correct causal relationships on the Abalone dataset.
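
The BIC selection step can be illustrated with a minimal sketch. It assumes the score is written as log-likelihood minus (k/2) ln n, so that a higher score is better, in line with the abstract's "highest score". The candidate names, log-likelihoods, and parameter counts below are made-up illustration values, not results from the paper, and the sketch does not build the CHCR candidate models themselves.

```python
import math


def bic_score(log_likelihood, num_params, num_samples):
    """BIC written as a score to maximize: log L - (k / 2) * ln(n)."""
    return log_likelihood - 0.5 * num_params * math.log(num_samples)


def best_model(candidates, num_samples):
    """Return the name of the candidate with the highest BIC score.

    `candidates` maps a model name to (log_likelihood, num_params).
    """
    def score(name):
        log_likelihood, num_params = candidates[name]
        return bic_score(log_likelihood, num_params, num_samples)

    return max(candidates, key=score)


# Hypothetical candidates: the compact model fits slightly worse
# but uses far fewer parameters than the full model.
candidates = {
    "compact": (-1210.0, 6),
    "full": (-1200.0, 20),
}
chosen = best_model(candidates, num_samples=1000)
```

With these illustrative numbers the parameter penalty outweighs the small gain in likelihood, so the more compact candidate is selected.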

Dynamic recommendation algorithm for group-users' temporal behaviors

WEN Wen, LIU Fang, CAI Ruichu, HAO Zhifeng

Journal of Computer Applications 2021, 41 (1): 60-66. DOI: 10.11772/j.issn.1001-9081.2020061010

To address the issues that user preferences change with time in real systems and that a user ID may be shared by multiple members of a family, a dynamic recommendation algorithm was proposed for group-users that contain multiple types of members and whose preferences vary with time. Firstly, it was assumed that the user's historical behavior data were composed of exposure data and click data, and the current member role was identified by learning the role weights of all types of members of the group-user at the present moment. Secondly, two design ideas were proposed to construct a popularity model from the exposure data, and the training data were balanced by inverse propensity score weighting. Finally, matrix factorization was used to obtain the time-varying user latent preference factors and the item latent attribute factors, and the inner products of the two were calculated to obtain the user's time-varying Top-K preference recommendations. Experimental results show that the proposed algorithm not only outperforms the benchmark method at no fewer than 16 of the 24 time points in a day on the three metrics of Recall, Mean Average Precision (MAP), and Normalized Discounted Cumulative Gain (NDCG), but also shortens the running time and reduces the time complexity of calculation.
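
The final ranking step can be sketched as follows. This is only an illustration of scoring items by the inner product of a time-dependent user factor and fixed item factors. The factor values, item names, and time labels are hypothetical, and the learning of the factors (role weights, popularity model, inverse propensity weighting) is not shown.

```python
def top_k_items(user_factor, item_factors, k):
    """Rank items by the inner product with the user's factor vector."""
    scores = {
        item: sum(u * v for u, v in zip(user_factor, vec))
        for item, vec in item_factors.items()
    }
    ranked = sorted(scores, key=lambda item: scores[item], reverse=True)
    return ranked[:k]


# Hypothetical latent factors for one shared account at two time points.
user_factor_at = {
    "morning": [0.9, 0.1],
    "evening": [0.1, 0.9],
}
# Hypothetical item attribute factors, assumed constant over time.
item_factors = {
    "news": [1.0, 0.0],
    "cartoon": [0.0, 1.0],
    "drama": [0.5, 0.6],
}

morning_top2 = top_k_items(user_factor_at["morning"], item_factors, 2)
evening_top2 = top_k_items(user_factor_at["evening"], item_factors, 2)
```

Because the user factor changes with time, the same account receives different Top-K lists in the morning and in the evening.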

Improved block diagonal subspace clustering algorithm based on neighbor graph

WANG Lijuan, CHEN Shaomin, YIN Ming, XU Yueying, HAO Zhifeng, CAI Ruichu, WEN Wen

Journal of Computer Applications 2021, 41 (1): 36-42. DOI: 10.11772/j.issn.1001-9081.2020061005

The Block Diagonal Representation (BDR) model can efficiently cluster data by using linear representation, but it cannot make good use of the non-linear manifold information that commonly appears in high-dimensional data. To solve this problem, an improved Block Diagonal Representation based on Neighbor Graph (BDRNG) clustering algorithm was proposed, which performs linear fitting of the local geometric structure through the neighbor graph and generates the block-diagonal structure through block-diagonal regularization. In the BDRNG algorithm, global information and local data structure are learned at the same time to achieve better clustering performance. Because the model contains the neighbor graph and a non-convex block-diagonal representation norm, BDRNG adopts alternating minimization to solve the optimization problem. Experimental results show that, on the noisy dataset, BDRNG generates a stable coefficient matrix in block-diagonal form, which shows that BDRNG is robust to noisy data; on the standard datasets, BDRNG achieves better clustering performance than BDR, and on the facial dataset in particular its clustering accuracy is 8% higher than that of BDR.

Node classification method in social network based on graph encoder network

HAO Zhifeng, KE Yanrong, LI Shuo, CAI Ruichu, WEN Wen, WANG Lijuan

Journal of Computer Applications 2020, 40 (1): 188-195. DOI: 10.11772/j.issn.1001-9081.2019061116

To merge node attributes and network structure information for classifying the nodes of a social network, a social network node classification algorithm based on a graph encoder network was proposed. Firstly, the information of each node was propagated to its neighbors. Secondly, for each node, the possible implicit relationships between the node and its neighbors were mined through a neural network, and these relationships were merged together. Finally, higher-level features of each node were extracted from the information of the node itself and its relationships with the neighboring nodes, these features were used as the representation of the node, and the node was classified according to this representation. On the Weibo dataset, compared with the DeepWalk model, the logistic regression algorithm, and the recently proposed graph convolutional network, the proposed algorithm improves the classification accuracy by more than 8%; on the DBLP dataset, its classification accuracy is 4.83% higher than that of the multilayer perceptron and 0.91% higher than that of the graph convolutional network.
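
The first step, propagating node information to neighbors, can be sketched in a few lines. The sketch assumes a simple mean over a node's own features and those of its neighbors. The abstract does not specify the aggregation rule, and the neural network that mines implicit relationships is not shown. The graph and feature values are hypothetical.

```python
def propagate(features, neighbors):
    """Average each node's own feature vector with those of its neighbors."""
    updated = {}
    for node, own in features.items():
        # The node itself plus every neighbor contributes one vector.
        group = [own] + [features[n] for n in neighbors.get(node, [])]
        updated[node] = [
            sum(vec[i] for vec in group) / len(group)
            for i in range(len(own))
        ]
    return updated


# Hypothetical three-node path graph a - b - c with 2-d attributes.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
smoothed = propagate(features, neighbors)
```

After one round of propagation, each node's vector mixes its own attributes with local structure, which is the kind of input a later encoder layer could work on.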

Application of asymmetric information in link prediction

XIE Rui, HAO Zhifeng, LIU Bo, XU Shengbing

Journal of Computer Applications 2018, 38 (6): 1698-1702. DOI: 10.11772/j.issn.1001-9081.2017102467

The accuracy of link prediction based on node similarity is reduced when asymmetric information is not considered. To solve this problem, a novel node similarity measure that uses asymmetric information was proposed. Firstly, the disadvantage of the similarity measure based on Common Neighbors (CN) was analyzed: it considers only the number of common neighbors, not the total number of neighbors of each node. Secondly, the similarity between nodes was defined as the ratio of the common neighbors to all the neighbor nodes. Then, the symmetric and asymmetric similarity information between nodes was combined to describe the similarity between nodes in detail. Finally, the proposed method was applied to predict links in complex networks. Experimental results on real datasets show that, compared with previous common-neighbor-based similarity measures such as CN, Adamic-Adar (AA), and Resource Allocation (RA), the proposed method improves the accuracy of both node similarity measurement and link prediction in complex networks.
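
A small sketch shows why a ratio-based measure is asymmetric while the plain CN count is not. It assumes one reading of the abstract, in which the similarity from x to y is the number of common neighbors divided by the number of neighbors of x. How the paper combines the symmetric and asymmetric parts is not stated in the abstract and is not attempted here. The graph is hypothetical.

```python
def common_neighbors(graph, x, y):
    """Symmetric CN score: the number of shared neighbors."""
    return len(graph[x] & graph[y])


def directed_similarity(graph, x, y):
    """Share of x's neighbors that are also neighbors of y (assumed form)."""
    if not graph[x]:
        return 0.0
    return common_neighbors(graph, x, y) / len(graph[x])


# Hypothetical undirected graph stored as neighbor sets.
graph = {
    "hub": {"a", "b", "c", "d"},
    "leaf": {"a", "b"},
    "a": {"hub", "leaf"},
    "b": {"hub", "leaf"},
    "c": {"hub"},
    "d": {"hub"},
}

cn = common_neighbors(graph, "hub", "leaf")
hub_to_leaf = directed_similarity(graph, "hub", "leaf")
leaf_to_hub = directed_similarity(graph, "leaf", "hub")
```

Here CN gives the same count in both directions, whereas the ratio is 0.5 from the hub's side and 1.0 from the leaf's side, since all of the leaf's neighbors are shared but only half of the hub's are.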

Performance optimization of wireless network based on canonical causal inference algorithm

HAO Zhifeng, CHEN Wei, CAI Ruichu, HUANG Ruihui, WEN Wen, WANG Lijuan

Journal of Computer Applications 2016, 36 (8): 2114-2120. DOI: 10.11772/j.issn.1001-9081.2016.08.2114

Existing wireless network performance optimization methods are mainly based on correlation analysis between indicators, and therefore cannot effectively guide the design of optimization strategies and other interventions. To address this, a Canonical Causal Inference (CCI) algorithm was proposed and applied to wireless network performance optimization. Firstly, because wireless network performance is usually described by numerous correlated indicators, Canonical Correlation Analysis (CCA) was employed to extract atomic events from the indicators. Then, a typical causal inference method was applied to the extracted atomic events to find the causality among them. These two stages were iterated to determine the causal network of the atomic events, which provides a robust and effective basis for wireless network performance optimization. The validity of CCI was verified by simulation experiments, and some valuable causal relations among wireless network indicators were found in data from more than 30000 mobile base stations in a city.

Emotion classification for news readers based on multi-category semantic word clusters

WEN Wen, WU Biao, CAI Ruichu, HAO Zhifeng, WANG Lijuan

Journal of Computer Applications 2016, 36 (8): 2076-2081. DOI: 10.11772/j.issn.1001-9081.2016.08.2076

The analysis of readers' emotions helps to find negative information on the Internet and is an important part of public opinion monitoring. Since the main factor that leads to the different emotions of readers is the semantic content of the text, how to extract semantic features of the text has become an important issue. To solve this problem, the initial features related to the semantic content of the text were expressed by the word2vec model. On this basis, representative semantic word clusters were established for all emotion categories. Furthermore, a strategy was adopted to select the representative word clusters that are helpful for emotion classification, so that the traditional text word vector was transformed into a vector over semantic word clusters. Finally, multi-label classification was implemented for emotion label learning and classification. Experimental results demonstrate that the proposed method achieves better accuracy and stability than state-of-the-art methods.

Selective K-means clustering ensemble based on random sampling

WANG Lijuan, HAO Zhifeng, CAI Ruichu, WEN Wen

Journal of Computer Applications 2013, 33 (7): 1969-1972. DOI: 10.11772/j.issn.1001-9081.2013.07.1969

Without any prior information about the data distribution, parameters, and data labels, not all base clustering results truly benefit the combined decision of a clustering ensemble. In addition, if every base clustering plays the same role, the performance of the clustering ensemble may be weakened. This paper proposed a selective K-means clustering ensemble based on random sampling, called RS-KMCE. In RS-KMCE, random sampling helps to avoid local minima in the process of selecting the base clustering subset for the ensemble, and an evaluation index defined according to diversity and accuracy leads to a better base clustering subset, which improves the performance of the clustering ensemble. Experimental results on two synthetic datasets and four UCI datasets show that the proposed RS-KMCE performs better than K-means, the K-means clustering ensemble, and the selective K-means clustering ensemble based on bagging.
